AITopics | explanation robustness

Robust Explanations of Graph Neural Networks via Graph Curvatures

Neural Information Processing Systems, Jun-11-2026, 09:35:27 GMT

Explaining graph neural networks (GNNs) is a key approach to improving the trustworthiness of GNNs in high-stakes applications such as finance and healthcare. However, existing explanation methods are vulnerable to perturbations, raising concerns about explanation reliability. Prior methods enhance explanation robustness through model retraining or explanation ensembles, each with weaknesses: retraining yields models that differ from the original target model and can therefore produce misleading explanations, while ensembles can produce contradictory results due to different inputs or models. To improve explanation robustness without these weaknesses, we take an unexplored route and exploit two geometric properties of edges, curvature and resistance. We are the first to prove that these geometric notions can be used to bound explanation robustness. We design a general optimization algorithm that incorporates these geometric properties into a wide spectrum of base GNN explanation methods to make the base explanations more robust. We empirically show that our method outperforms six base explanation methods in robustness across nine datasets spanning node classification, link prediction, and graph classification tasks, improving fidelity in 80% of the cases and achieving up to a 10% relative improvement in robust performance.
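As an illustration of one of the edge properties named in the abstract, the sketch below computes the standard effective resistance between two nodes from the pseudoinverse of the graph Laplacian. The abstract does not state the paper's exact definitions of curvature or resistance, so this is only the textbook quantity, under the assumption of a connected, undirected graph given as a dense adjacency matrix; the function name `effective_resistance` is hypothetical.

```python
import numpy as np


def effective_resistance(adj, u, v):
    """Textbook effective resistance between nodes u and v.

    Assumes `adj` is the symmetric adjacency matrix of a connected,
    undirected graph. This is NOT taken from the paper; it is the
    standard definition R_uv = (e_u - e_v)^T L^+ (e_u - e_v).
    """
    adj = np.asarray(adj, dtype=float)
    # Graph Laplacian L = D - A.
    laplacian = np.diag(adj.sum(axis=1)) - adj
    # Moore-Penrose pseudoinverse, since L is singular.
    l_pinv = np.linalg.pinv(laplacian)
    # Indicator difference vector e_u - e_v.
    e = np.zeros(adj.shape[0])
    e[u] += 1.0
    e[v] -= 1.0
    return float(e @ l_pinv @ e)


# Path graph 0 - 1 - 2: resistances add in series.
path = np.array([[0, 1, 0],
                 [1, 0, 1],
                 [0, 1, 0]])

# Triangle: one direct edge in parallel with a two-edge path.
triangle = np.array([[0, 1, 1],
                     [1, 0, 1],
                     [1, 1, 0]])

r_path = effective_resistance(path, 0, 2)
r_triangle = effective_resistance(triangle, 0, 1)
```

On the path graph the two unit edges act in series, and on the triangle the direct edge acts in parallel with the two-edge detour, which gives a quick sanity check for the values of `r_path` and `r_triangle`.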

artificial intelligence, machine learning, robustness, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.60)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.30)

0626822954674a06ccd9c234e3f0d572-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 10:25:15 GMT

explanation, robustness, thickness, (15 more...)

Neural Information Processing Systems

Country:

Asia > China > Guangdong Province > Guangzhou (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)
(7 more...)

Genre: Research Report > Experimental Study (0.67)

Industry:

Government (0.46)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(3 more...)

Adversarial Attack for Explanation Robustness of Rationalization Models

Zhang, Yuankai, Kong, Lingxiao, Wang, Haozhao, Li, Ruixuan, Wang, Jun, Li, Yuhua, Liu, Wei

arXiv.org Artificial Intelligence, Sep-19-2024

Rationalization models, which select a subset of the input text as a rationale (crucial for humans to understand and trust predictions), have recently emerged as a prominent research area in eXplainable Artificial Intelligence. However, most previous studies focus on improving the quality of the rationale and ignore its robustness to malicious attacks. Specifically, whether rationalization models can still generate high-quality rationales under adversarial attack remains unknown. To explore this, this paper proposes UAT2E, which aims to undermine the explainability of rationalization models without altering their predictions, thereby eliciting distrust in these models from human users. UAT2E employs a gradient-based search for triggers and then inserts them into the original input to conduct both non-target and target attacks. Experimental results on five datasets reveal the vulnerability of rationalization models in terms of explanation: under attack, they tend to select more meaningless tokens. Based on this, we make a series of recommendations for improving rationalization models in terms of explanation.

rationale, rationalization model, robustness, (14 more...)

arXiv.org Artificial Intelligence

2408.10795

Country:

Europe > Spain > Galicia > Madrid (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Belgium > Flanders (0.04)
(13 more...)

Genre: Research Report (1.00)

Industry:

Media > Television (1.00)
Media > Film (1.00)
Leisure & Entertainment (1.00)
(3 more...)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape

Chen, Tiejin, Huang, Wenwang, Pang, Linsey, Luo, Dongsheng, Wei, Hua

arXiv.org Artificial Intelligence, Mar-9-2024

This paper examines deep learning robustness, challenging the conventional belief that classification robustness and explanation robustness in image classification systems are inherently correlated. Through a novel evaluation approach that leverages clustering for efficient assessment of explanation robustness, we demonstrate that enhancing explanation robustness does not necessarily flatten the input loss landscape with respect to explanation loss, in contrast to classification, where a flatter loss landscape indicates better robustness. To investigate this contradiction, we propose a training method designed to adjust the loss landscape with respect to explanation loss. With the new training method, we find that although such adjustments can affect the robustness of explanations, they do not influence the robustness of classification. These findings not only challenge the prevailing assumption of a strong correlation between the two forms of robustness but also open new pathways for understanding the relationship between loss landscape and explanation loss.

classification robustness, explanation robustness, robustness, (11 more...)

arXiv.org Artificial Intelligence

2403.06013

Country: North America > United States > Arizona (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Robust Ranking Explanations

Chen, Chao, Guo, Chenghua, Ma, Guixiang, Zeng, Ming, Zhang, Xi, Xie, Sihong

arXiv.org Artificial Intelligence, Jul-8-2023

Robust explanations of machine learning models are critical to establishing human trust in the models. Due to limited cognitive capacity, most humans can only interpret the top few salient features. It is therefore critical to make the top salient features robust to adversarial attacks, especially attacks against the more vulnerable gradient-based explanations. Existing defenses measure robustness using $\ell_p$-norms, which offer weaker protection. We define explanation thickness to measure the ranking stability of salient features, and derive tractable surrogate bounds of the thickness to design the R2ET algorithm, which efficiently maximizes the thickness and anchors the top salient features. Theoretically, we prove a connection between R2ET and adversarial training. Experiments with a wide spectrum of network architectures and data modalities, including brain networks, demonstrate that R2ET attains higher explanation robustness under stealthy attacks while retaining accuracy.

artificial intelligence, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2307.04024

Country:

Europe > Italy > Marche > Ancona Province > Ancona (0.04)
Asia > China > Beijing > Beijing (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(2 more...)

Genre: Research Report (0.40)

Industry:

Government (0.66)
Health & Medicine > Health Care Technology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Provable Robust Saliency-based Explanations

Chen, Chao, Guo, Chenghua, Ma, Guixiang, Zeng, Ming, Zhang, Xi, Xie, Sihong

arXiv.org Artificial Intelligence, Jul-8-2023

Robust explanations of machine learning models are critical to establishing human trust in the models. The top-$k$ intersection is widely used to evaluate the robustness of explanations. However, most existing attack and defense strategies are based on $\ell_p$ norms, creating a mismatch between the evaluation and optimization objectives. To address this, we define explanation thickness to measure the ranking stability of the top-$k$ salient features, and design the R2ET algorithm, based on a novel tractable surrogate, to maximize the thickness and stabilize the top salient features efficiently. Theoretically, we prove a connection between R2ET and adversarial training; using a novel multi-objective optimization formulation and a generalization error bound, we further prove that the surrogate objective can improve both the numerical and statistical stability of the explanations. Experiments with a wide spectrum of network architectures and data modalities demonstrate that R2ET attains higher explanation robustness under stealthy attacks while retaining model accuracy.
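The top-$k$ intersection mentioned in the abstract can be sketched as the fraction of the $k$ most salient features that two explanations share. The version below is a minimal sketch under stated assumptions: features are ranked by absolute saliency (some evaluations rank by signed value instead), and the function name `top_k_intersection` and the example vectors are hypothetical, not taken from the paper.

```python
import numpy as np


def top_k_intersection(saliency_a, saliency_b, k):
    """Fraction of the top-k features shared by two saliency vectors.

    Assumption: features are ranked by absolute saliency value.
    """
    a = np.asarray(saliency_a, dtype=float).ravel()
    b = np.asarray(saliency_b, dtype=float).ravel()
    if a.shape != b.shape:
        raise ValueError("saliency vectors must have the same size")
    if not 1 <= k <= a.size:
        raise ValueError("k must be between 1 and the number of features")
    # Indices of the k largest-magnitude entries in each explanation.
    top_a = set(np.argsort(-np.abs(a), kind="stable")[:k].tolist())
    top_b = set(np.argsort(-np.abs(b), kind="stable")[:k].tolist())
    return len(top_a & top_b) / k


# Hypothetical saliency maps before and after an input perturbation.
original = [0.9, 0.1, 0.5, 0.3]
perturbed = [0.8, 0.4, 0.2, 0.35]

overlap = top_k_intersection(original, perturbed, k=2)
```

Here the top-2 features are {0, 2} before the perturbation and {0, 1} after it, so only one of the two is preserved. The metric depends only on which features rank highest, not on how far the saliency values move in an $\ell_p$ sense, which is the mismatch the abstract points to.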

artificial intelligence, explanation, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2212.14106

Country:

Asia > Middle East > Jordan (0.04)
Europe > Italy > Marche > Ancona Province > Ancona (0.04)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Robust Explanation Constraints for Neural Networks

Wicker, Matthew, Heo, Juyeon, Costabello, Luca, Weller, Adrian

arXiv.org Artificial Intelligence, Dec-16-2022

Post-hoc explanation methods are used with the intent of providing insights about neural networks and are sometimes said to help engender trust in their outputs. However, popular explanation methods have been found to be fragile to minor perturbations of input features or model parameters. Relying on constraint relaxation techniques from non-convex optimization, we develop a method that upper-bounds the largest change an adversary can make to a gradient-based explanation via bounded manipulation of either the input features or model parameters. By propagating a compact input or parameter set as symbolic intervals through the forward and backward computations of the neural network, we can formally certify the robustness of gradient-based explanations. Our bounds are differentiable, hence we can incorporate provable explanation robustness into neural network training. Empirically, our method surpasses the robustness provided by previous heuristic approaches. We find that our training method is the only method able to learn neural networks with certificates of explanation robustness across all six datasets tested.
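The idea of propagating an input set as intervals can be illustrated with plain interval arithmetic through one affine layer followed by a ReLU. This is only a sketch of the forward half under simplifying assumptions (axis-aligned box inputs, fixed weights); the paper's symbolic intervals and its backward pass that bounds the gradient-based explanation are not reproduced here, and the names `affine_interval` and `relu_interval` as well as the example weights are hypothetical.

```python
import numpy as np


def affine_interval(lower, upper, weight, bias):
    """Sound elementwise bounds on weight @ x + bias for lower <= x <= upper."""
    center = (lower + upper) / 2.0
    radius = (upper - lower) / 2.0
    out_center = weight @ center + bias
    # |W| @ radius is the largest deviation each output can reach.
    out_radius = np.abs(weight) @ radius
    return out_center - out_radius, out_center + out_radius


def relu_interval(lower, upper):
    """ReLU is monotone, so it can be applied to each bound directly."""
    return np.maximum(lower, 0.0), np.maximum(upper, 0.0)


# Hypothetical layer and an input box of radius eps around x.
weight = np.array([[1.0, -2.0],
                   [0.5, 1.0]])
bias = np.array([0.0, -1.0])
x = np.array([1.0, 1.0])
eps = 0.1

pre_lower, pre_upper = affine_interval(x - eps, x + eps, weight, bias)
post_lower, post_upper = relu_interval(pre_lower, pre_upper)
```

Every input inside the box is guaranteed to produce an output inside `[post_lower, post_upper]`. Certifying an explanation would additionally require bounding the gradient computed in the backward pass, which this sketch leaves out.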

artificial intelligence, explanation, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2212.08507

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Information Technology > Security & Privacy (0.46)
Government > Military (0.46)
Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
